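We present a technique for zero-shot generation of a 3D model using only a target text prompt. Without any 3D supervision, our method deforms the control shape of a limit subdivision surface, along with its texture map and normal map, to obtain a 3D asset that corresponds to the input text prompt and can be easily deployed in games or modeling applications. We rely only on a pre-trained CLIP model that compares the input text prompt with differentiably rendered images of our 3D model. While previous works have focused on stylization or required training of generative models, we optimize the mesh parameters directly to generate shape, texture, or both. To constrain the optimization to produce plausible meshes and textures, we introduce a number of techniques based on image augmentations, together with a pretrained prior that generates CLIP image embeddings given a text embedding.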
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
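Achievements in artificial intelligence are used to advance computing and to build intelligent machines that assist people and improve user experience. Emotions are fundamental to humans, affecting thinking and everyday activities such as communication, learning, and direction. Speech emotion recognition is an area of interest in this regard. In this work, we propose a novel mel-spectrogram learning approach in which our model learns emotions from WAV-format voice recordings in the widely used CREMA-D dataset. Our model uses log mel-spectrograms as features, with the number of mel bands set to 64. It requires less training time than other approaches to speech emotion recognition.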
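The feature described above, a log mel-spectrogram with 64 mel bands, can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the function names, the 16 kHz sample rate, the FFT size of 1024, the hop of 256 samples, the Hann window, and the common 2595·log10(1 + f/700) mel formula are all assumptions chosen for the example. Only the band count of 64 comes from the abstract.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel formula (an assumption; the abstract does not specify one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels=64):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    # Frame the signal, apply a Hann window, and take the power spectrum per frame.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    # Small offset avoids log(0) in silent bands; result is in decibels.
    return 10.0 * np.log10(mel + 1e-10)

if __name__ == "__main__":
    sr = 16000  # assumed sample rate for the example
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)  # synthetic stand-in for a voice recording
    features = log_mel_spectrogram(tone, sr)
    print(features.shape)
```

The output has one row per frame and 64 columns, one per mel band. A model such as the one described would take this array as its input; how frames are batched or padded is left open here.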
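Modern herbicide application in agricultural settings typically relies either on large-scale sprayers that dispense herbicide over crops and weeds alike or on portable sprayers that require labor-intensive manual operation. The former method leads to herbicide overuse and reduced crop yield, while the latter is often untenable in large-scale operations. This paper presents the first fully autonomous robot for weed management in row crops that is capable of computer-vision-based navigation, weed detection, and complete field coverage, at a cost under $400. The target application is autonomous inter-row weed control in crop fields, e.g., flax and canola, where the spacing between crop rows is as small as one foot. The proposed robot is small enough to pass between crop rows at all stages of plant growth while detecting weeds and spraying herbicide. The recharging system comprises newly designed robot hardware, a ramp, a robotic charging arm, and a mobile charging station. An integrated vision algorithm is used to assist with charger alignment. Combined, these components enable the robot to work continuously in the field without access to electricity. In addition, a color-based contour algorithm combined with preprocessing techniques provides robust navigation from the input of the onboard monocular camera. Incorporating such compact robots into farms could help automate weed control, even during late stages of growth, and reduce herbicide use by targeting weeds precisely. The robotic platform was field-tested in flaxseed fields in North Dakota.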
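Microorganisms, especially microswimmers, are of interest to the fields of biology and fluid dynamics because of their locomotive and mechanical efficiency. The challenge in designing flagellated micro- and macro-scale robots stems from the geometrically nonlinear deformation of slender structures (e.g., rod-like flagella) that results from the interplay of elasticity and hydrodynamics. Certain types of bacteria, such as Escherichia coli, propel themselves by rotating multiple filamentary structures in low-Reynolds-number flow. This multi-flagellated propulsion mechanism is qualitatively different from the single-flagellated mechanism exhibited by other types of bacteria, such as Vibrio cholerae. The differences include the flagella forming a bundle to increase the directional stability of cell motility, offering redundancy for cell movement, and allowing the flagella themselves to serve as the delivery material. Above all, multi-flagellated biological systems can inspire novel soft robots for drug transport and delivery within the human body. We present a macroscopic soft robotic hardware platform and a computational framework for a physically plausible simulation model of the multi-flagellated robot. The fluid-structure interaction simulation couples the Discrete Elastic Rods algorithm with the method of Regularized Stokeslet Segments. Contact between two flagella is handled by a penalty-based method due to Spillmann and Teschner. We compare our experimental and simulation results and verify that the simulation tool can capture the essential physics of this problem. The stability and efficiency of the multi-flagellated robot are compared with those of its single-flagellated counterpart.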
Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean test images, yet persistently predicts an attacker-defined label for any sample in the presence of the backdoor trigger. Although backdoor attacks have been extensively studied in the image domain, there are very few works that explore such attacks in the video domain, and they tend to conclude that image backdoor attacks are less effective in the video domain. In this work, we revisit the traditional backdoor threat model and incorporate additional video-related aspects to that model. We show that poisoned-label image backdoor attacks could be extended temporally in two ways, statically and dynamically, leading to highly effective attacks in the video domain. In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain. And, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models, where we show that attacking a single modality is enough for achieving a high attack success rate.
We present the interpretable meta neural ordinary differential equation (iMODE) method to rapidly learn generalizable (i.e., not parameter-specific) dynamics from trajectories of multiple dynamical systems that vary in their physical parameters. The iMODE method learns meta-knowledge, the functional variations of the force field of dynamical system instances without knowing the physical parameters, by adopting a bi-level optimization framework: an outer level capturing the common force field form among the studied dynamical system instances and an inner level adapting to individual system instances. A priori physical knowledge, such as a conservative force field or Euclidean symmetry, can be conveniently embedded in the neural network architecture as inductive bias. With the learned meta-knowledge, iMODE can model an unseen system within seconds and can inversely reveal knowledge of the physical parameters of a system, serving as a Neural Gauge to "measure" the physical parameters of an unseen system from observed trajectories. We test the validity of the iMODE method on bistable, double pendulum, Van der Pol, Slinky, and reaction-diffusion systems.
Unmanned aerial vehicle (UAV) swarms are considered a promising technique for next-generation communication networks due to their flexibility, mobility, low cost, and ability to provide services collaboratively and autonomously. Distributed learning (DL) enables UAV swarms to intelligently provide communication services, multi-directional remote surveillance, and target tracking. In this survey, we first introduce several popular DL algorithms, such as federated learning (FL), multi-agent reinforcement learning (MARL), distributed inference, and split learning, and present a comprehensive overview of their applications for UAV swarms, such as trajectory design, power control, wireless resource allocation, user assignment, perception, and satellite communications. Then, we present several state-of-the-art applications of UAV swarms in wireless communication systems, such as reconfigurable intelligent surfaces (RIS), virtual reality (VR), and semantic communications, and discuss the problems and challenges that DL-enabled UAV swarms can solve in these applications. Finally, we describe open problems of using DL in UAV swarms and future research directions for DL-enabled UAV swarms. In summary, this survey provides a comprehensive overview of DL applications for UAV swarms across a wide range of scenarios.
Compared to regular cameras, Dynamic Vision Sensors or Event Cameras can output compact visual data based on a change in the intensity in each pixel location asynchronously. In this paper, we study the application of current image-based SLAM techniques to these novel sensors. To this end, the information in adaptively selected event windows is processed to form motion-compensated images. These images are then used to reconstruct the scene and estimate the 6-DOF pose of the camera. We also propose an inertial version of the event-only pipeline to assess its capabilities. We compare the results of different configurations of the proposed algorithm against the ground truth for sequences of two publicly available event datasets. We also compare the results of the proposed event-inertial pipeline with the state-of-the-art and show it can produce comparable or more accurate results provided the map estimate is reliable.
With Twitter's growth and popularity, a huge number of views are shared by users on various topics, making this platform a valuable information source on various political, social, and economic issues. This paper investigates English tweets on the Russia-Ukraine war to analyze trends reflecting users' opinions and sentiments regarding the conflict. The tweets' positive and negative sentiments are analyzed using a BERT-based model, and the time series associated with the frequency of positive and negative tweets for various countries is calculated. Then, we propose a method based on the neighborhood average for modeling and clustering the time series of countries. The clustering results provide valuable insight into public opinion regarding this conflict. Among other things, we can mention the similar thoughts of users from the United States, Canada, the United Kingdom, and most Western European countries versus the shared views of Eastern European, Scandinavian, Asian, and South American nations toward the conflict.